Learning Fair Naive Bayes Classifiers by Discovering and Eliminating Discrimination Patterns
As machine learning is increasingly used to make real-world decisions, recent
research efforts aim to define and ensure fairness in algorithmic decision
making. Existing methods often assume a fixed set of observable features to
define individuals, but rarely address the case where some of those features
are unobserved at test time. In this paper, we study fairness of naive Bayes
classifiers, which allow partial observations. In particular, we introduce the
notion of a discrimination pattern, which refers to an individual receiving
different classifications depending on whether some sensitive attributes were
observed. Then a model is considered fair if it has no such pattern. We propose
an algorithm to discover discrimination patterns in a naive Bayes
classifier, and show how to learn maximum likelihood parameters subject to
these fairness constraints. Our approach iteratively discovers and eliminates
discrimination patterns until a fair model is learned. An empirical evaluation
on three real-world datasets demonstrates that we can remove exponentially many
discrimination patterns by only adding a small fraction of them as constraints.
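To make the pattern definition concrete, the following toy sketch (all parameter values and variable names are made up for illustration, not from the paper) checks whether disclosing a sensitive attribute shifts a naive Bayes posterior by more than a threshold delta:

```python
# Hypothetical sketch: detecting a discrimination pattern in a tiny
# naive Bayes classifier over binary variables. A pattern exists when
# disclosing a sensitive attribute S shifts P(D=1 | evidence) by more
# than a threshold delta. All numbers below are illustrative.

p_d1 = 0.3  # prior P(D = 1)
cpt = {     # P(feature = 1 | D = d), keyed by (feature, d)
    ("S", 0): 0.5, ("S", 1): 0.8,   # 'S' is the sensitive attribute
    ("X", 0): 0.4, ("X", 1): 0.6,   # 'X' is a non-sensitive feature
}

def posterior_d1(evidence):
    """P(D=1 | evidence) for a dict {feature: value} of observations."""
    joint = {1: p_d1, 0: 1.0 - p_d1}
    for d in (0, 1):
        for feat, val in evidence.items():
            p1 = cpt[(feat, d)]
            joint[d] *= p1 if val == 1 else 1.0 - p1
    return joint[1] / (joint[0] + joint[1])

def is_discrimination_pattern(partial, sensitive, delta=0.05):
    """True if adding the sensitive observation moves the posterior > delta."""
    without = posterior_d1(partial)
    with_s = posterior_d1({**partial, **sensitive})
    return abs(with_s - without) > delta

print(is_discrimination_pattern({"X": 1}, {"S": 1}))
```

Here observing S = 1 on top of X = 1 moves the posterior by roughly 0.12, exceeding delta, so this partial observation would count as a discrimination pattern.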
Group Fairness by Probabilistic Modeling with Latent Fair Decisions
Machine learning systems are increasingly being used to make impactful
decisions such as loan applications and criminal justice risk assessments, and
as such, ensuring fairness of these systems is critical. This is often
challenging as the labels in the data are biased. This paper studies learning
fair probability distributions from biased data by explicitly modeling a latent
variable that represents a hidden, unbiased label. In particular, we aim to
achieve demographic parity by enforcing certain independencies in the learned
model. We also show that group fairness guarantees are meaningful only if the
distribution used to provide those guarantees indeed captures the real-world
data. In order to closely model the data distribution, we employ probabilistic
circuits, an expressive and tractable probabilistic model, and propose an
algorithm to learn them from incomplete data. We evaluate our approach on a
synthetic dataset in which observed labels indeed come from fair labels but
with added bias, and demonstrate that the fair labels are successfully
retrieved. Moreover, we show on real-world datasets that our approach not only
models how the data was generated better than existing methods but also
achieves competitive accuracy.
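To make the demographic parity requirement concrete, here is a minimal sketch (the joint distribution and helper names are toy assumptions) that measures the parity gap of a distribution over a sensitive attribute S and a decision D; enforcing independence of D and S in the learned model drives this gap to zero:

```python
# Illustrative sketch: demographic parity holds when P(D=1 | S=s) is
# the same for every group s. The toy joint below makes D independent
# of S by construction, so its parity gap is zero.

joint = {  # P(S=s, D=d); entries must sum to 1
    (0, 0): 0.35, (0, 1): 0.15,
    (1, 0): 0.35, (1, 1): 0.15,
}

def p_d1_given_s(s):
    """Conditional P(D=1 | S=s) from the joint distribution."""
    return joint[(s, 1)] / (joint[(s, 0)] + joint[(s, 1)])

def parity_gap():
    """Largest difference in positive-decision rates across groups."""
    rates = [p_d1_given_s(s) for s in (0, 1)]
    return max(rates) - min(rates)

print(parity_gap())
```

Note this matches the abstract's caveat: the guarantee is only meaningful if `joint` actually captures the real-world distribution, which is why an expressive density model matters.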
Probabilistic Reasoning for Fair and Robust Decision Making
Automated decision-making systems are increasingly being deployed in areas with high personal and societal impact. This has naturally led to growing interest in trustworthy artificial intelligence (AI) and machine learning (ML), encompassing many fields of research including algorithmic fairness, robustness, explainability, privacy, and more. These works share a common theme of questioning and moderating the behavior of automated tools in various real-world settings, which inherently exhibit different uncertainties. This dissertation explores how probabilistic modeling and reasoning as a framework offer a principled way to handle uncertainties when addressing trustworthy AI issues, in particular by explicitly modeling the underlying distribution of the world. The main contributions are as follows. First, it demonstrates that many problems in trustworthy AI can be cast as probabilistic reasoning tasks of varying complexities. Second, it proposes algorithms to learn fair and robust decision-making systems, while handling many sources of uncertainty such as missing or biased labels at training time and missing features at prediction time. The proposed approach relies heavily on probabilistic models that are expressive enough to describe the world underlying the system, whilst being tractable enough to answer various probabilistic queries. The final contribution of this thesis is showing that probabilistic circuits are an effective model for this framework and expanding their reasoning capabilities even further.
Probabilistic Reasoning and Learning for Trustworthy AI
As automated decision-making systems are increasingly deployed in areas with personal and societal impacts, there is a growing demand for artificial intelligence and machine learning systems that are fair, robust, interpretable, and generally trustworthy. Ideally, we would wish to answer questions regarding these properties and provide guarantees about any automated system to be deployed in the real world. This raises the need for a unified language and framework under which we can reason about and develop trustworthy AI systems. This talk will discuss how tractable probabilistic reasoning and learning provides such a framework.
It is important to note that guarantees regarding fairness, robustness, etc., hold with respect to the distribution of the world in which the decision-making system operates. For example, to see whether automated loan decisions are biased against a certain gender, one may compare the average decision for each gender; this requires knowledge of how the features used in the decision are distributed for each gender. Moreover, there are inherent uncertainties in modeling this distribution, in addition to the uncertainties when deploying a system in the real world, such as missing or noisy information. We can handle such uncertainties in a principled way through probabilistic reasoning. Taking fairness-aware learning as an example, we can deal with biased labels in the training data by explicitly modeling the observed labels as being generated from some probabilistic process that injects bias and noise into hidden, fair labels, particularly in a way that best explains the observed data.
A key challenge that still needs to be addressed is that we need models that can closely fit complex real-world distributions—i.e., expressive—while also being amenable to exact and efficient inference of probabilistic queries—i.e., tractable. I will show that probabilistic circuits, a family of tractable probabilistic models, offer both of these benefits. In order to ultimately develop a common framework to study various areas of trustworthy AI (e.g., privacy, fairness, explanations), we need models that can flexibly answer different questions, even ones they were not designed to foresee. This talk will thus survey the efforts to expand the horizon of complex reasoning capabilities of probabilistic circuits, especially highlighted by a modular approach that answers various queries via a pipeline of a handful of simple tractable operations.
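As a toy illustration of the expressive-yet-tractable tradeoff, the following sketch (the circuit structure and weights are invented for illustration, not taken from the talk) evaluates marginal queries on a tiny smooth, decomposable circuit in a single bottom-up pass:

```python
# Minimal sketch of tractable marginal inference in a probabilistic
# circuit, assuming a smooth and decomposable structure. Indicator
# leaves evaluate to 1 when their variable is marginalized out, so
# any marginal query costs one pass over the circuit.

def leaf(var, value):
    # Indicator leaf: 1 on a matching observation, 1 if unobserved.
    def f(evidence):
        return 1.0 if evidence.get(var, value) == value else 0.0
    return f

def product(*children):
    def f(evidence):
        out = 1.0
        for c in children:
            out *= c(evidence)
        return out
    return f

def weighted_sum(weighted_children):
    def f(evidence):
        return sum(w * c(evidence) for w, c in weighted_children)
    return f

# A two-component mixture over binary A and B, written as a circuit:
circuit = weighted_sum([
    (0.6, product(weighted_sum([(0.9, leaf("A", 1)), (0.1, leaf("A", 0))]),
                  weighted_sum([(0.7, leaf("B", 1)), (0.3, leaf("B", 0))]))),
    (0.4, product(weighted_sum([(0.2, leaf("A", 1)), (0.8, leaf("A", 0))]),
                  weighted_sum([(0.5, leaf("B", 1)), (0.5, leaf("B", 0))]))),
])

print(circuit({}))        # total mass: 1.0
print(circuit({"A": 1}))  # P(A=1) = 0.6*0.9 + 0.4*0.2 = 0.62
```

The same bottom-up pass answers any marginal query exactly, which is the kind of tractability the talk contrasts with expressive but intractable density models.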
Women's Mobility, Travel, and Literary Representations in the Long Eighteenth Century
My dissertation recovers women’s increased mobility and widened geography as represented in British literature of the long eighteenth century, when there was a revolution in travel culture. My project revises the current scholarship on eighteenth-century travel literature, which has centered on a few canonical male writers, by recovering the rich tradition of late seventeenth- and early eighteenth-century women’s travel writings. In chapter one, I focus on several issues with the current model of women’s travel writing and propose incorporating a broader range of texts, such as manuscript travelogues and travel fiction, to recover more women travel writers in the early modern period. Chapter two analyzes single women’s domestic travel represented in Manley’s Letters (1696) and Davys’s The Fugitive (1705), and shows how professional women writers used the genre of travel writing to create respectable authorial images in their early careers. In chapter three, I attend to Penelope Aubin’s international travel fiction of the 1720s, arguing that she participates in early modern knowledge production about the world through her fictional representations of women’s transnational trajectories in exotic spaces such as the Islamic world and the Far East. Chapter four investigates the reception history of Lady Mary Wortley Montagu, whose image as an iconic British female traveler was consumed by her contemporaries and later generations in various ways, for instance as an unfeminine or immoral traveler. By rewriting the history of travel literature with a focus on women’s travels in the long eighteenth century, my project not only challenges the contemporary discourse that associates women’s travels with danger, but also demonstrates how eighteenth-century readers formed a market for narratives highlighting women’s increased mobility at home and abroad.
When Bad Becomes Good: The Role of Congruence and Product Type in the CSR Initiatives of Stigmatized Industries
This study examines how congruence and product type affect consumer responses to the corporate social responsibility (CSR) initiatives of stigmatized industries. Two experiments with college students assessed the effects of congruence and CSR type (cause-related marketing vs. advocacy advertising) in Study 1, and of congruence and product type (hedonic vs. utilitarian) in Study 2, on consumers’ corporate evaluations. The results of Study 1 showed that congruence generates weaker attributions of corporate altruistic motives and more negative perceptions of corporate credibility and attitude than incongruence. However, there was no significant effect of CSR type on consumers’ evaluations of a company. The findings of Study 2 revealed significant main effects of product type: when CSR initiatives are associated with hedonic products, consumers’ altruistic attributions, credibility judgments, and attitudes toward the product are more negative than when they are linked to utilitarian products in stigmatized industries. Moreover, there was a marginally significant interaction effect of congruence and product type on attitude toward the company. This study contributes to the emerging body of CSR literature on stigmatized industries by empirically determining how congruence and hedonic product type may cause backlash effects.
Certifying Fairness of Probabilistic Circuits
With the increased use of machine learning systems for decision making, questions about the fairness properties of such systems start to take center stage. Most existing work on algorithmic fairness assumes complete observation of features at prediction time, as is the case for popular notions like statistical parity and equal opportunity. However, this is not sufficient for models that can make predictions with partial observation, as we could miss patterns of bias and incorrectly certify a model to be fair. To address this, a recently introduced notion of fairness asks whether the model exhibits any discrimination pattern, in which an individual—characterized by (partial) feature observations—receives vastly different decisions merely by disclosing one or more sensitive attributes such as gender and race. By explicitly accounting for partial observations, this provides a much more fine-grained notion of fairness.
In this paper, we propose an algorithm to search for discrimination patterns in a general class of probabilistic models, namely probabilistic circuits. Previously, such algorithms were limited to naive Bayes classifiers, which make strong independence assumptions; by contrast, probabilistic circuits provide a unifying framework for a wide range of tractable probabilistic models and can even be compiled from certain classes of Bayesian networks and probabilistic programs, making our method much more broadly applicable. Furthermore, for an unfair model, it may be useful to quickly find discrimination patterns and distill them for better interpretability. As such, we also propose a sampling-based approach to more efficiently mine discrimination patterns, and introduce new classes of patterns such as minimal, maximal, and Pareto optimal patterns that can effectively summarize exponentially many discrimination patterns.
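As a rough, hypothetical sketch of the brute-force baseline that such mining algorithms improve upon, the following enumerates all partial observations of a toy scoring function and keeps only minimal discrimination patterns (the model, threshold, and posterior are all invented assumptions; the paper works with probabilistic circuits and far more efficient search):

```python
# Brute-force discrimination-pattern mining over a toy model. A pattern
# is a partial observation whose posterior shifts by more than DELTA
# when a sensitive attribute is disclosed; a minimal pattern has no
# strictly smaller pattern inside it. Everything here is illustrative.
from itertools import product as iproduct

FEATURES = ["X1", "X2"]   # non-sensitive binary features
SENSITIVE = "S"           # binary sensitive attribute
DELTA = 0.1

def posterior(evidence):
    # Stand-in for P(D=1 | evidence); any tractable model (e.g. a
    # probabilistic circuit) could be queried here instead.
    score = 0.5 + 0.15 * evidence.get("S", 0) + 0.1 * evidence.get("X1", 0)
    return min(max(score - 0.05 * evidence.get("X2", 0), 0.0), 1.0)

def mine_patterns():
    patterns = []
    # enumerate every partial assignment to the non-sensitive features
    for mask in iproduct([None, 0, 1], repeat=len(FEATURES)):
        partial = {f: v for f, v in zip(FEATURES, mask) if v is not None}
        for s in (0, 1):
            gap = posterior({**partial, SENSITIVE: s}) - posterior(partial)
            if abs(gap) > DELTA:
                patterns.append((partial, s, round(gap, 3)))
    return patterns

def minimal_patterns(patterns):
    # minimal: no pattern with a strict subset of observations and the
    # same sensitive disclosure is also a discrimination pattern
    def subsumes(p, q):
        return p[1] == q[1] and set(p[0].items()) < set(q[0].items())
    return [q for q in patterns
            if not any(subsumes(p, q) for p in patterns)]
```

In this toy model every pattern is subsumed by disclosing S with no other observations, so the minimal set collapses to a single pattern, illustrating how minimality can summarize an exponentially large pattern set.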